DOCUMENT RESUME

ED 435 731                                  TM 030 363

AUTHOR        Lane, Eliesh O'Neil; Dietz, James S.; Chompalov, Ivan;
              Bozeman, Barry; Park, Jongwon
TITLE         Using the Curriculum Vita To Study the Career Paths of
              Scientists and Engineers: An Assessment.
SPONS AGENCY  Department of Energy, Washington, DC.; National Science
              Foundation, Arlington, VA.
PUB DATE      1999-11-03
NOTE          59p.; Paper presented at the Annual Meeting of the American
              Evaluation Association (Orlando, FL, November 3-6, 1999).
CONTRACT      DE-FG02-96ER45562; SBR-98-18229
PUB TYPE      Reports - Research (143) -- Speeches/Meeting Papers (150)
EDRS PRICE    MF01/PC03 Plus Postage.
DESCRIPTORS   Career Development; *Careers; Coding; *Engineers; *Internet;
              *Research Methodology; *Resumes (Personal); *Scientists


ABSTRACT

The usefulness of the curriculum vita (CV) as a data source
for examining the career paths of scientists and engineers was studied. CVs
were obtained in response to an e-mail message sent to researchers working in
the area of biotechnology who were funded by the National Science Foundation
(55 responses) or listed as authors (industry only) in the "Science Citation
Index" (19 responses). In addition, CVs were obtained passively from a search
of the Internet (30 CVs). Methodological issues and problems of this data
collection strategy are discussed, along with the results of an exploratory
analysis. In sum, despite difficulties with coding and variation in CV
formats, this collection strategy seems to hold much promise for examining
career paths. An appendix describes a coding experiment performed to
investigate ways to code CVs for data collection. A second appendix lists
coding items that were not found to be reliable. (Author/SLD)



Reproductions supplied by EDRS are the best that can be made 
from the original document. 






Using the Curriculum Vita to Study the Career Paths of 
Scientists and Engineers: An Assessment 1 



Eliesh O’Neil Lane 
James S. Dietz 
Ivan Chompalov 
Barry Bozeman 
and Jongwon Park 






Research Value Mapping Program 
School of Public Policy 
Georgia Institute of Technology 
Atlanta, Georgia 30332 
http://rvm.pp.gatech.edu 



November 3, 1999 






This paper was prepared for the American Evaluation Association meeting in Orlando, FL,
November 4, 1999.



1 The authors gratefully acknowledge the support of the U.S. Department of Energy (DE-FG02-96ER45562)
and the National Science Foundation (SBR 98-18229). The opinions expressed in the paper
are the authors’ and do not necessarily reflect the views of the Department of Energy or the National 
Science Foundation. This work was performed under the Research Value Mapping Program 
[http://rvm.pp.gatech.edu] of the Georgia Institute of Technology. The authors wish to thank Jeff Bournes, 
Marie Chesnut, Jungki Kim, Andy McNeil, Seth Sobel, Ryoung Song, and Larry Wilson for their 
assistance. The authors also thank Juan Rogers for his helpful advice. 







Abstract: 



In this paper, we assess the utility of the curriculum vita (CV) as a data source for 
examining the career paths of scientists and engineers. CVs were obtained in response to 
an email message sent to researchers working in the area of biotechnology who were 
funded by the National Science Foundation or listed as authors (industry only) in the 
Science Citation Index. In addition, a number of CVs were obtained “passively” from a 
search of the Internet. We discuss the methodological issues and problems of this data 
collection strategy and the results from our exploratory analysis. In sum, despite 
difficulties with coding and variation in CV formats, this collection strategy seems to us 
to hold much promise. 







Using the Curriculum Vita to Study the Career Paths of Scientists and
Engineers: An Assessment



1.0 Introduction 

Scientists’ and engineers’ career trajectories have much in common with other 
professional paths. Motivational factors are not so very different, including income, need 
for achievement and recognition, and desire for “interesting” work. Scientists and 
engineers face many of the same constraints as others, choosing jobs because of a 
spouse’s opportunities, the quality of schools available to children, distance from family, 
and so forth. Thus, the standard models available to labor economists can tell us much 
about scientists and engineers. 

But there are some respects in which scientists and engineers differ dramatically 
from dentists or attorneys or airline pilots. Some differences are obvious, such as the peculiar formal
assets (patents and publications, for example) that scientists and engineers bring with them.
Other assets are less obvious, less formal, but perhaps even more important. Each 
scientist and engineer can be thought of as a unique embodiment of “scientific and 
technical human capital” (S&T human capital), a walking set of knowledge, skills, 
technical know-how and — just as important — a set of sustained network communications, 
often dense in pattern and international in scope. In previous work (Bozeman, Dietz, and 
Gaughan, forthcoming), we outlined S&T human capital as an alternative model for
research evaluation, originating in response to the limitations of traditional economic and
state-of-the-art models.

The S&T human capital model puts more weight than traditional models do on the sustained ability of
scientists and engineers to enhance their own capabilities and those of the people with whom they
work. S&T human capital includes not only the researcher’s human
capital but the social capital he or she draws upon in creating knowledge and interacting 
in various social and professional contexts. It includes not just the educational 
credentials normally recognized in traditional human capital models (Becker, 1962; 
Schultz, 1963) but the researchers’ tacit knowledge (Polanyi, 1967; Polanyi, 1969), craft 
knowledge, and know-how. And, essential to the effective exploitation of all of these 
human capital endowments is the social capital (Bourdieu, 1986; Bourdieu & Wacquant, 
1992; Coleman, 1988; Coleman, 1990) that scientists continually exercise in engaging 
their interests. 

These endowments not only make the study of scientists’ and engineers’ career 
trajectories more difficult (e.g., less amenable to standard labor models) but more 
challenging. When a dentist changes jobs it is of interest chiefly to old and new clients. 
When a scientist or engineer changes jobs the implications are often profound: the 
movement of S&T human capital is, arguably, a vital element of scientific discovery, 
technological innovation, and even economic development. For S&T human capital 
transcends the intellect of any one individual. Thus, individual migration patterns of 
scientists and engineers can be likened more appropriately to the movement of the web of 
S&T human capital they possess — a web that continually manifests new shapes and 
patterns. Thus, for example, if Northern California was known as Dental Floss Valley
there might well be a concentration of healthy teeth. But, for scientists and engineers
network dependencies imply something altogether different for migration patterns, which 
rightly command researchers’ and policymakers’ attention. 

If the career trajectories of scientists and engineers are often a bit more
complicated and less predictable than those of dentists (or airline pilots, or attorneys), they also
leave more marks along the trail. One of the great, albeit largely unexploited, advantages
of studying the careers of scientists and engineers is the near universal reliance on the
curriculum vita (CV). The utility of CV data for study of S&T human capital is striking, 
at least at first blush. The CV provides not only a clear-cut indicator of movement from 
one work setting to the next but is, in a sense, a representation of certain aspects of S&T 
human capital. Not only does it indicate the skills and knowledge embodied in scientists 
and engineers (through publications and other technical activities) but the professional 
association memberships, consulting, and co-authorship patterns serve as a crude index of 
social capital. 

The CV, unlike other data sources, often recounts the entire career of the scholar 
in some detail. Thus, it is not simply a list of credentials, but an historical document that 
evolves over time capturing changes in interests, jobs, and collaborations. Whether 
viewed as historical record, marketing tool, or scientific resource, it is a potentially 
valuable datum for persons interested in career trajectories or, more generally, science 
and technology studies. Not only is the CV nearly universal, it is in some respects
standard, and it is relatively easily obtained (sometimes even from the public domain). 
Most important, the CV contains useful, concrete information on the timing, sequence, 
and duration of jobs, work products (e.g., articles, patents, papers), collaborative patterns,
and scholarly lineage. The CV is, indeed, a rich source of longitudinal data, which lends
itself especially well to the study of phenomena associated with careers and labor
flows, precisely the target of S&T human capital.

On the other hand, this proposed method is not without its limitations or 
problems. In fact, several of the advantages to using the CV as a data source can also be 
viewed as disadvantages. First, because the information is self-reported, its accuracy 
requires verification by the researcher. Second, the semi-structured format falls short of a
purely standardized template, thus risking the elimination of valuable information or the
inclusion of extraneous data. Perhaps most significant, however, is the
enormous work involved in coding the CV for subsequent data analysis. Not only is the 
coding time consuming, it is also tedious and runs the risk of introducing error due to 
coder fatigue. In some cases it is possible to have as many as 1,200 variables for one CV. 

Despite its limitations, the potential of the CV as a research tool is enormous. Yet 
it has been used only sparingly — and sometimes incidentally — as a research device. We 
seek to address this neglect, to explain it, and to assess the promise and obstacles to a 
research agenda employing the CV as primary data. The development of such a 
methodology undoubtedly would provide a unique and potentially useful alternative for 
evaluating scientists’ and engineers’ careers. 

1.1 Organization of Paper 

A major objective of this paper is simply to determine the extent to which it is 
possible to obtain useful CV data and to assess the utility of various approaches to 
collecting CVs. In section two, we present a review of the literature on scientific careers.
In section three, methodological issues are presented and discussed with specific attention
to several coding and data-related issues. In section four, we address issues of validity 
and reliability, data consistency and quality, and CV accessibility. After examining the 
descriptive findings (in section five), we reflect more broadly (section six) on our 
assessment of the utility of a CV-based methodology, including possible strategies for 
improving the quality and consistency of data. In section seven, we present our 
conclusions. 

2.0 CVs, Scientific and Technical Human Capital, and Research Value 

2.1 The Research Value Mapping Program 

Our interest in S&T human capital, and the potential of CVs as a research tool for 
mapping flows of this capital, stems from a general interest in assessing the impacts of 
government-financed research projects. The Research Value Mapping (RVM) Program 
within the School of Public Policy at the Georgia Institute of Technology began in 1996, 
using 30 intensive case studies of research projects as sources of both qualitative and 
quantitative information about the nature and intensity of the projects’ scientific and
socioeconomic impacts (see Bozeman, et al., 1999; http://rvm.pp.gatech.edu). 

The Phase I work, sponsored by the Department of Energy’s (DOE) Office of 
Science, focused entirely on DOE-sponsored projects in government and university labs. 
We are beginning Phase II based on continued funding from DOE and with new funding 
from the National Science Foundation. The mission of Phase II is to compare research 
impacts in multiple fields and in the U.S. and France. Whereas Phase I focused on
information developed in the case studies, Phase II will focus on S&T human capital
impacts, using the CV as one research tool to examine labor flows and career trajectories. 
The core hypothesis of Phase II is that many of the impacts of projects are not easily 
confined within normal project boundaries but occur over considerable time as S&T 
human capital diffuses into other settings. 

We will ultimately test several hypotheses about the connection between the 
characteristics of team-oriented R&D projects and the diffusion of S&T human capital 
via the projects’ “graduates.” A preliminary study (just begun for this paper) of scientists
and engineers in the area of biotechnology provides the opportunity to explore the use of 
the curriculum vita as a methodological tool for garnering such information. 

Studies of innovations have already established the importance of close coupling 
for knowledge transfer and the diffusion of innovations in the economy (Rogers, 1995). 
The flow of people from one organization, firm, or group to another is key in the process 
of knowledge exchange. But, despite some good attempts (e.g., Stephan and Levin, 
1997; Simonton, 1997), the extant literature has not managed to fully capture the 
dynamic nature of these flows over time and across research contexts. Careers are 
inherently dynamic — evolving and intersecting in planned and unplanned ways, but 
traditional research evaluation models view them as static or, at best, additive and
cumulative over time. We hope, in the next round of the RVM Program, to address this 
need. 







2.2 Curriculum Vitae and Credit Allocation in Science 

Despite the potential value of CVs as both data collection instruments and sources 
of data on productivity, recognition, career trajectories, and mobility in R&D, there is a 
paucity of theoretical and empirical investigations. One of the few studies that shed some 
light on the importance of CVs is Latour and Woolgar’s anthropological account of the 
social production of scientific knowledge in a neuroendocrinological laboratory (Latour 
and Woolgar, 1986). From their point of view the CV is considered a “balance sheet” of 
a scientist’s past investments and a testament to his or her credibility. Latour and 
Woolgar claim that, apart from accreditation, awards, collaborations, and publications, 
there is an element of the CV that plays a crucial role in estimating a researcher’s total 
value. In their view, value is a three-part notion that incorporates academic rank, 
situation in the field, and geographical location (Latour and Woolgar, 1986). CVs and 
interviews could serve not only as valid sources of information to reconstruct individual 
career trajectories, but also group dynamics and the accumulation of social capital in the 
form of credit. 

Surprisingly, CVs have been used only infrequently to illuminate well-studied 
processes of how the social system of science operates. Given the tradition in sociology 
of science to focus on rewards and credit allocation, for example, the dearth of studies 
employing CVs as rich data sources to trace the award of credit is quite noteworthy. 
Most of the past research on recognition in science has been carried out within some 
economic model of knowledge production. 

The psychosocial mechanisms of reward in science have been, if anything, well 
known and investigated since at least the 1960s. The best known thesis regarding
scientific credit is the “Matthew effect” described by Robert Merton (1973a). He defines
this phenomenon as reflecting an “accumulation of advantages,” so that already 
outstanding scientists receive disproportionately more credit for their contributions than 
younger researchers who are perhaps less visible in the field. Merton notes that while he 
coined the term to refer to the greater recognition scientists of higher rank receive for 
their discoveries, it undoubtedly has implications for the communication system (e.g.,
visibility), as well as for resource distribution (e.g., grant funds). The Matthew thesis fits
well within the framework of the normative structure of science (Merton, 1973b) and its
system of social stratification (Cole and Cole, 1973). And, although the effect may be 
pervasive in all fields, Zuckerman and Merton point out that, in all likelihood, it operates 
more strongly in less codified fields such as the social sciences and humanities. In these 
disciplines, “the personal and social attributes of scientists are more likely to influence 
the visibility of their ideas and the reception accorded them” (Zuckerman and Merton, 
1973, p. 516). 

Overall, the Mertonian treatment of credit allocation is consistent with the 
neoclassical economic view of early “entrepreneurial capitalism,” where scientists
operate on a free market and try to maximize their utility. Another economics model — 
the exchange system of gift-giving in primitive societies and other settings (Hyatt and 
Hopkins, 1998) — dominates Hagstrom’s conception of scientific recognition. In a 
nutshell, scientists give away information expecting to be rewarded by field recognition 
for their contributions: “social control in science is exercised in an exchange system, a 
system wherein gifts of information are exchanged for recognition from scientific 
colleagues” (Hagstrom, 1965, p. 52). 







Unlike Hagstrom, who postulates a mechanism analogous to primitive gift
exchanges, Latour (1986) argues that the control system of science operates similarly to,
and is apparently derived from, the Marxist political economy of capitalism. In Latour’s
view, scientists are interested in gaining credit or credibility mainly because it gives them 
access to other resources, which, in turn, can be translated into further credit. The 
resulting image is that of a perpetual cycle where the accumulation of credit and faster 
rates of turnover become ends in and of themselves. This “cycle of credibility” involves 
conversions between different forms of capital (e.g., money, equipment, data). 

Latour’s model strives to overcome a weakness in Hagstrom’s and Bourdieu’s 
(1975) theories, namely their failure to consider demand (of scientists for each other’s
work). In Latour’s view, the researcher as homo economicus gets caught up in the 
objective of market activity “to extend and speed up the credibility cycle as a whole” 
(Latour and Woolgar, 1986, p. 207). 

2.3 Scientific Productivity and the Life Course 

There is strong support for the thesis that academic career trajectories and
especially promotion are significantly affected by productivity in terms of both quantity 
and quality of publications. Such a relationship is perceived as a confirmation of the 
Mertonian normative model of scientific knowledge production and particularly the 
operation of the “universalism” norm. Empirically, it has been demonstrated that 
allocation of citations follows a “repayment of intellectual debt” mode, rather than a
social constructivist “network” perspective (Baldi, 1998). Promotion or the achievement
of a higher rank in science is explained by institutional prestige, at least in the
case of academic psychologists (Hurlbert and Rosenfeld, 1992). Rank advancement has
been shown to depend more on the sheer quantity than on the quality of publications for
university departments in biochemistry (Long et al., 1993). The results from event history 
analysis of promotional patterns that Long and associates report also indicate that the
likelihood for promotion is lower for women than for men for a job change from assistant 
to associate professorship. However, when a battery of control variables was added to 
the model, the gender difference was cut in half and was no longer statistically 
significant. Of course, although gender differences in research productivity have been
well documented, there has been considerable controversy regarding the explanation of these
differentials. This has led some authors to label the phenomenon ‘the productivity
puzzle’ (Cole and Zuckerman, 1984; Xie and Shauman, 1998). Field effects also account
for a significant amount of variation in productivity (Bonzi, 1992) and, consequently, in
academic promotion.

Life cycle models view the careers of scientists as a longitudinal function of the 
individual’s skill levels and his or her incentives to act productively (Diamond, 1984; 
1986). At earlier stages of career building, productivity incentives are strong while skills 
are growing. At the middle stages (and sometimes even earlier), both incentives and 
skills are strong as productivity peaks. And at later stages, both begin to wane, as does 
productivity. The concept of a career life cycle originated in human capital theory from 
an economics tradition (Becker, 1963). Human capital theory sought to relate 
investments in human beings (education, training, job and life experiences, and personal 
health) to an individual’s earnings trajectory. 



In the scientific life cycle model, Levin and Stephan (1991) report that scientific
productivity follows one of two general patterns (depending on scientific discipline): one 
where productivity simply declines with age, the other where it increases at first but then 
declines with age. Although there is plenty of empirical evidence to support this notion 
of diminishing marginal rates of productivity, such models fail to explain much variation 
in productivity (Stephan, 1996). Moreover, as Stephan and Levin have pointed out, many 
of these life-cycle models lack sufficient attention to the research process and the 
institutional setting of the process (Stephan and Levin, 1997). 

Researchers have also called attention to the role of early career collaboration and 
mentoring as spurs to longer-term scientific productivity. Long and McGinnis (1984) 
found significant and lasting effects of predoctoral collaboration with mentors on the 
careers of biochemists. The productivity of the mentor was positively and strongly 
related to the biochemists’ own publication productivity six years later. For students who 
had not collaborated with their mentor, there was no relationship. Similarly, Reskin 
(1977), studying chemists who obtained their Ph.D. in the late 1950s, found graduates 
from higher “caliber” departments were more likely to have collaborated with their 
doctoral mentor and showed higher productivity after their first postdoctoral decade than 
graduates from lesser-prestige departments. Zuckerman (1977) found that Nobel prize 
winners viewed their doctoral apprenticeship as crucial to their later success and, 
specifically, in building broad skills such as proper standards of achievement, tastes in 
choice of research problems, and confidence in their work and abilities. 

Life course models can be thought of as an enhancement or conceptual expansion 
of life cycle models. Elaborated by Elder (1994), the life course paradigm views
individual lives as affected by the historical period in which events occur, the
developmental timing and sequence of events, and the involvement of the individual in 
relevant social relationships. Elder refers to the concept of human agency, which, as
applied to science, can be thought of as the unique set of abilities that each scientist uses
to translate his or her training and skills into scientific outputs. All individuals have 
“human agency,” although in different mixes. In a sense, human agency is a recognition 
that individuals vary in the predispositions (both strengths and weaknesses) they bring to 
the construction of a life course. Elder warns, however, that life course is more than just 
human agency. It is human agency constrained by developmental timing and history 
effects. 

The most important contribution that life course models have made to the
understanding of scientific careers, and to S&T human capital for that matter, is the
notion that human lives are linked, or interdependent with each other, not just
statically but dynamically over time. Merton ([1965] 1993) recognized this in titling
his book, On The Shoulders of Giants, in which he illustrates how Newton made his 
intellectual advances using the contributions of his scientific peers and forefathers. The 
life course concept illustrates the dynamic form of learning and communication among 
individual scientists and the meso and macro social contexts in which they are engaged. 
It is not completely socially deterministic, but nor does it rely strictly on individual 
reductionism. 







3.0 Framing the Research Issues: Some Practical Concerns 

Very few studies have employed CVs as data sources about trends in job mobility 
in science. Typically, CVs are used as a supplemental source of information that serves 
to fill in the gaps from other documents (Long et al., 1993; Gomez-Mejia and Balkin, 
1992). Even when CVs 1 are utilized as the primary or only data source, their advantages 
or disadvantages are rarely discussed (Bonzi, 1992). The notion of using CVs as a 
research tool is hardly a novel idea. But the actual utility of CVs lies in answers to some
quite practical questions and resolution of some fundamental methodological issues. 



3.1 What are the labor issues? 

A wealth of information is provided in most CVs but the coding of the 
information and its entry into a database is not at all straightforward. When one 
considers that some CVs include hundreds of publications and conference papers, many 
with multiple authors, the costs of labor become apparent. The options are few. If one 
wishes to capture almost all the information in a CV into standard databases, the 
enterprise likely involves thousands of observations per case. This requires a small army 
of labor, well trained and perhaps not all “low end” inexperienced data entry personnel. 
A second option is to mechanize as much as possible, through scanners, but this too has
substantial labor and set-up costs. The final option, which almost inescapably must be pursued, is
to limit data capture. Absent prodigious data entry resources, the only option is to forgo
much data or to categorize data at a relatively high level of abstraction (e.g., count
articles). The trick of the trade, then, is to optimize time, data capture, and labor. The
hazards include insufficient culling, poorly predicted labor requirements, or settling on
data at so high a level of aggregation that inferences are obscured. Can one develop
heuristics or some empirical base for making such decisions? That is one of our concerns
in the paper and the overall project.

1 Methodologically, CVs have been found to closely match information from other secondary sources such
as the American Psychological Association’s directory. Nevertheless, these other secondary sources have
been shown to undercount the number of published journal articles as compared to CVs (Heinsler and
Rosenfeld, 1987).

3.2 How to Operationalize the CV data? 

With anything less than complete data capture, the particular operationalization of 
CV data becomes vital. Even after whole sections of data are dismissed (in our case, 
such sections as conference papers, courses taught, internal working papers), one still 
must grapple with measuring the remainder. Is it important, for example, to capture not 
only article publications but also author numbers and author order? How does one 
represent data in more economical indices and, more to the point, how does one know 
which indices are most useful without sufficient original data to employ in indices? Are 
data best represented in arrays, across time, or in cross-sectional detail? To be sure, 
many of these answers depend upon specific hypotheses of specific studies, but if the CV 
database is to serve as a general resource for multiple research objectives, specific 
hypotheses provide little relief. 
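
The trade-off between time-ordered arrays and cross-sectional indices can be made concrete with a minimal sketch of a coded CV record, given here in Python. Every class, field, and method name below is an illustrative assumption and not the coding scheme used in this study; the example person, employers, and publications are invented. The point is only that if author number and author order are captured at entry, both an across-time view and a cross-sectional index can be derived later from the same record.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Position:
    """One job spell listed on a CV (hypothetical field names)."""
    title: str
    employer: str
    sector: str                      # e.g. "academic", "industry", "government"
    start_year: int
    end_year: Optional[int] = None   # None means the position is current


@dataclass
class Publication:
    """One article entry, keeping author number and author order."""
    year: int
    n_authors: int
    author_position: int             # 1 = first author


@dataclass
class CVRecord:
    person_id: str
    positions: List[Position] = field(default_factory=list)
    publications: List[Publication] = field(default_factory=list)

    def job_changes(self) -> int:
        """Count moves between employers, taking positions in start-year order."""
        ordered = sorted(self.positions, key=lambda p: p.start_year)
        return sum(1 for a, b in zip(ordered, ordered[1:]) if a.employer != b.employer)

    def publications_by_year(self) -> Dict[int, int]:
        """Across-time view: publication counts keyed by year."""
        counts: Dict[int, int] = {}
        for pub in self.publications:
            counts[pub.year] = counts.get(pub.year, 0) + 1
        return counts

    def first_author_share(self) -> float:
        """Cross-sectional index: share of publications with the person listed first."""
        if not self.publications:
            return 0.0
        firsts = sum(1 for pub in self.publications if pub.author_position == 1)
        return firsts / len(self.publications)


# Invented example data, for illustration only.
cv = CVRecord(
    person_id="example-001",
    positions=[
        Position("Postdoctoral Fellow", "University A", "academic", 1988, 1990),
        Position("Assistant Professor", "University B", "academic", 1990, 1996),
        Position("Senior Scientist", "Company C", "industry", 1996),
    ],
    publications=[
        Publication(1989, 3, 1),
        Publication(1991, 2, 2),
        Publication(1991, 4, 1),
        Publication(1997, 5, 3),
    ],
)
```

A record stored this way keeps the raw detail, so an index such as `first_author_share()` can be revised later without recoding the CV; a database that stored only article counts could not support that.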

3.3 Where and how does one obtain CVs? 

Virtually every scientist and engineer has a CV. But how does one obtain it? 
One way is simply to write a letter or an email message requesting one. Least obtrusive 
is simply going to public domain websites and downloading the publicly available data.
How does one know the yield and peculiarities of return from each of these various
approaches? 

3.4 Are CVs consistent?

An interesting problem is “which CV does one obtain?” First there is the time 
issue. If one wishes to examine CVs over time, one finds that most people do not keep 
old CVs, only recent ones. This is a problem when one considers that a great many CVs 
get truncated (e.g., “publications for past ten years”) and that the information that is 
important in an early career (e.g., all conference presentations) may be unimportant to the
scientist later and may disappear from the CV. The results are possible differences in
time, periodicity, and cohort. Interestingly, the availability of CVs on the web has been 
helpful and hurtful to those interested in the CV as data. The popularity of the web has 
meant that a great many more CVs are accessible, but the institutionalization of websites 
has led to a stylistic conformance of CVs, which is not itself a problem, and, typically, 
significant abridgment, which can be a great problem. If the CV on the web is typically 
an institutional rather than individual marketing resource, the rational marketing 
approach is succinct information about more people, rather than detailed information 
about particular people. 

3.5 How to link to benchmarks and secondary data? 

A great advantage of CV data is that it is so easy (conceptually) and useful to link 
to cognate data. The availability of a wide array of citation data through the Science 
Citation Index is extremely valuable. These same databases also include information on
the “power index” (i.e., the likelihood of citation) of journals. Similarly, the aggregate
data provided in the SESTAT database of the National Science Foundation also serves as 
a potentially fruitful linkage. The problem, of course, is that these activities multiply data
entry and manipulation costs by another order of magnitude. Furthermore, the decision to
use such benchmarks and cognate data requires making significant “up front” decisions 
on data collection strategies. 
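
The kind of linkage described here can be sketched as a simple keyed join between publications coded from a CV and a journal-level benchmark table. The Python below is a hedged illustration only: the function names, the `journal_score` field, and the placeholder journal names and values are all assumptions, and nothing here reflects the actual format of the Science Citation Index or SESTAT.

```python
def normalize(name):
    """Lower-case and collapse whitespace so minor formatting differences still match."""
    return " ".join(name.lower().split())


def link_publications(publications, journal_scores):
    """Attach a journal-level score to each CV publication; collect entries that fail to match."""
    lookup = {normalize(name): score for name, score in journal_scores.items()}
    linked, unmatched = [], []
    for pub in publications:
        key = normalize(pub["journal"])
        if key in lookup:
            linked.append({**pub, "journal_score": lookup[key]})
        else:
            unmatched.append(pub)
    return linked, unmatched


# Hypothetical journal-level benchmark (placeholder names and values).
journal_scores = {"Journal A": 2.5, "Journal B": 0.8}

# Publications as they might be coded from one CV (invented entries).
cv_publications = [
    {"year": 1994, "journal": "journal a"},
    {"year": 1996, "journal": "Journal  B"},
    {"year": 1997, "journal": "Proceedings of Meeting C"},
]

linked, unmatched = link_publications(cv_publications, journal_scores)
```

The unmatched list matters as much as the linked one: it shows how much of a CV falls outside the benchmark, which bears on the "up front" decision about whether journal titles need to be captured verbatim during coding.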

3.6 What is the coding validity and reliability? 

By almost any standard, the coding of more than a few CVs is a daunting task. We
know that coding error rates from relatively tractable survey data range from about 5 to 10
percent (Fowler, 1988). What about more difficult, less obvious CV data? While coding 
errors can at least be determined with some ease, it is not clear even which standard is 
best for coding reliability. Moreover, good measures of coder reliability require a good 
number of coders, again accelerating costs. Most important of all, however, is coding 
validity. Except for the most straightforward issues, CV coding is almost always sure to 
cause problems for any but the best-trained eye. Explaining to a coder how to deal with 
visiting professors working at (apparently) three different places, in two sites, with three 
ambiguous titles requires time, patience, and imagination. For example, the difference 
between a postdoc and a fellow may be vital in some instances but not in others. And how
does one determine whether a proceedings publication is consequential when working in a
number of very different fields? Is it possible to conveniently detail such matters for
coders in anything less than a 50-page codebook? 
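
Since it is not clear which standard is best for coding reliability, the sketch below shows two common candidates side by side for a pair of coders: simple percent agreement and Cohen's kappa, which discounts the agreement expected by chance. This is an illustration under stated assumptions, not the reliability measure adopted in this study; the function names and the ten invented position codes are hypothetical.

```python
from collections import Counter


def percent_agreement(codes_a, codes_b):
    """Share of items to which two coders assigned the same code."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("need two equal-length, non-empty code lists")
    matches = sum(1 for a, b in zip(codes_a, codes_b) if a == b)
    return matches / len(codes_a)


def cohens_kappa(codes_a, codes_b):
    """Cohen's kappa: (observed - expected agreement) / (1 - expected agreement)."""
    n = len(codes_a)
    observed = percent_agreement(codes_a, codes_b)
    freq_a = Counter(codes_a)
    freq_b = Counter(codes_b)
    # Chance agreement from each coder's marginal category frequencies.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one identical category throughout
    return (observed - expected) / (1.0 - expected)


# Invented codes for ten CV position entries (illustration only).
coder_a = ["postdoc", "postdoc", "fellow", "faculty", "faculty",
           "faculty", "postdoc", "fellow", "faculty", "faculty"]
coder_b = ["postdoc", "fellow", "fellow", "faculty", "faculty",
           "faculty", "postdoc", "postdoc", "faculty", "faculty"]

agreement = percent_agreement(coder_a, coder_b)
kappa = cohens_kappa(coder_a, coder_b)
```

In this invented example the two coders disagree only on postdoc versus fellow, the distinction flagged above as hard to explain to coders. Percent agreement is 0.8, while kappa is lower (about 0.68) because some of that agreement would be expected by chance.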







4.0 Research Methodology 



4.1 CV Selection Strategy 

Three approaches were used to obtain an expected sample of 350 CVs: a “targeted 
agency” search, a “targeted industry” search, and a “passive Internet” search. For both 
targeted searches, a direct email message was sent to potential respondents who had 
either recently conducted funded research or published in the area of biotechnology. 1 
Respondents were asked to submit a full CV via email or fax, although a few respondents 
actually preferred to mail a hard copy because they felt there may be security issues in 
sending us an electronic copy. For the passive search, various Internet search engines 
and search phrases were used to identify a subgroup of web-posted CVs. Of the sample 
group, 50 CVs were solicited from industry scientists and engineers, 200 from academic 
researchers, and 100 from the web.
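
As a rough check on the arithmetic, the targets above can be set against the CV counts given in the ERIC abstract for this paper (55, 19, and 30). The Python sketch below is illustrative only: the dictionary labels are ours, and dividing CVs obtained by the target is one assumed definition of yield. A rate computed on delivered messages only, for example, would differ.

```python
# Targets from the selection strategy described above.
targets = {
    "targeted agency (NSF)": 200,
    "targeted industry (SCI)": 50,
    "passive Internet": 100,
}

# CVs obtained, as reported in the document's abstract.
obtained = {
    "targeted agency (NSF)": 55,
    "targeted industry (SCI)": 19,
    "passive Internet": 30,
}

total_target = sum(targets.values())
total_obtained = sum(obtained.values())

# Assumed definition of yield: CVs obtained divided by the target for that approach.
yields = {approach: obtained[approach] / targets[approach] for approach in targets}
overall_yield = total_obtained / total_target

for approach, rate in yields.items():
    print(f"{approach}: {obtained[approach]} of {targets[approach]} ({rate:.1%})")
print(f"overall: {total_obtained} of {total_target} ({overall_yield:.1%})")
```

Under this definition each approach returns roughly 27 to 38 percent of its target, and the three targets sum to the expected sample of 350.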

4.2 Collection Procedures for Targeted Agency Search. 

A sample of 200 researchers funded by the National Science Foundation’s (NSF) 
Biotechnology program and working at US institutions was obtained from NSF’s awards 
database. This strategy has the main advantage of identifying a group of active
biotechnology researchers whose email addresses are provided by NSF. An email was
sent inviting the researchers to submit their full CV via email. Approximately 20 percent 
of the email addresses taken from the database were erroneous or obsolete and were 
returned. The researchers attempted to obtain a current address for all of undelivered 







